You know the drill: quiz in five minutes.
Okay.
By the way, do you all have an eye on the StudOn forums?
Because there were 11 people who, for the last quiz, were logged in with their email accounts, with no way for us to associate them with their names and matriculation numbers.
Which means if you did that, we can't give you the bonus points until we know who you are.
So I made a StudOn post. If those 11 people notice it at some point, please let me know so that we can give you the proper bonus points.
Okay.
Yeah, clearly it's the one with normalization that apparently did not work for anyone.
At least that makes computing the actual results very easy.
I just opened a post on StudOn. Anyone who had any other kind of problem regarding login, or who got black screens, please reply there.
Ideally, if you still have access to the JavaScript error message in your browser, or a screenshot, or anything that might help us diagnose this, please add it there if at all possible.
Okay.
Then let's start by recapping what we did last time. We were basically finished with the pure-math setup with respect to probability spaces and so on.
We will from now on almost exclusively work with distributions instead, where the distribution for a random variable is just the vector of the individual probabilities for all of the possible outcomes of that particular random variable.
That only works, of course, if we assume that our random variables have finite domains, which we are going to assume for the remainder of the course in almost all cases. A distribution for a random variable is then just the ordered list of all the individual probabilities.
We can, given multiple random variables, talk about the full joint probability distribution, which is just the n-dimensional array containing the probabilities of all the conjunctions of all possible values from the domains of the random variables involved.
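To make that concrete, here is a minimal sketch in Python (the variables and numbers are made up for illustration, not taken from the lecture materials):

```python
import numpy as np

# Hypothetical random variable Weather with domain (sunny, rain, cloudy):
# its distribution is just a vector of probabilities that sums to 1.
P_weather = np.array([0.6, 0.1, 0.3])
assert np.isclose(P_weather.sum(), 1.0)

# Hypothetical full joint distribution over two Boolean variables,
# Cavity and Toothache: a 2x2 array with one entry per value combination.
# Axis 0 is Cavity, axis 1 is Toothache; index 0 = True, index 1 = False.
P_joint = np.array([
    [0.12, 0.08],   # P(cavity, toothache),  P(cavity, ~toothache)
    [0.08, 0.72],   # P(~cavity, toothache), P(~cavity, ~toothache)
])
assert np.isclose(P_joint.sum(), 1.0)

# With k variables there is one axis per variable, so the array has as
# many entries as the product of the domain sizes -- it explodes quickly.
```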
As you can clearly see, that one explodes rather quickly. And of course we can do the same with conditional probabilities: we can form a conditional probability distribution, which just contains the individual conditional probabilities for every combination of values of the random variables involved.
And from now on we're basically going to work with those exclusively.
By convention we're now also going to just write down all of the equations that we're interested in in the form of distributions.
Every time you see any equation that contains a distribution, just assume that it represents a system of equations for every possible assignment of the variables involved in the distribution.
You can think of that in two ways. One way is literally just as a sequence of equations.
The other is as a vector equation, where the right-hand side tells you, component-wise, how to compute the individual entries of that vector.
Is there a good example here? In this case there isn't.
But if we assume that we had an actual value for Y here, for example, so if we were to write something like P(X, Y = y),
that would mean we can compute the vector of probabilities for every element of the domain of X by evaluating the right-hand side for each of the components.
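For instance (a hypothetical example, assuming X has the finite domain {x_1, ..., x_n}), the product rule written in distribution form unfolds into a system of scalar equations:

```latex
% Product rule in distribution form:
\mathbf{P}(X, Y{=}y) \;=\; \mathbf{P}(X \mid Y{=}y)\, P(Y{=}y)
% ...is shorthand for the n ordinary equations:
P(X{=}x_i, Y{=}y) \;=\; P(X{=}x_i \mid Y{=}y)\, P(Y{=}y), \qquad i = 1, \dots, n
```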
Ultimately either way it's just a shorthand for a system of equations. Why do we do that?
Well, it makes things easier. And the important thing to note is that if we know the full joint probability distribution of a bunch of random variables,
then we know basically any probability over any combination of those random variables, or outcomes thereof, that we might possibly be interested in.
Of course, as we have seen earlier, the full joint probability distribution gets really big really quickly,
so ultimately we are basically never in a position where we actually want to represent it in full.
But we are going to use it to do the math, to write down our equations;
and if we can derive the full joint probability distribution by some kind of arithmetic expression,
then we know that we can compute all of the individual entries.
And the remainder of at least this section is going to be about how to compute as many of those entries as possible
without ever having to actually write down the whole full joint probability distribution.
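As a sketch of what computing such entries looks like in practice (reusing the made-up joint table from above; this is plain inference by enumeration, nothing we would do at scale):

```python
import numpy as np

# Same hypothetical joint over Cavity (axis 0) and Toothache (axis 1)
# as above; index 0 = True, index 1 = False.
P_joint = np.array([
    [0.12, 0.08],
    [0.08, 0.72],
])

# Marginal P(Cavity): sum out Toothache (sum over axis 1).
P_cavity = P_joint.sum(axis=1)                    # -> [0.2, 0.8]

# Conditional distribution P(Cavity | toothache): take the column for
# Toothache = True and normalize it so that its entries sum to 1.
column = P_joint[:, 0]                            # [P(c, t), P(~c, t)]
P_cavity_given_toothache = column / column.sum()  # -> [0.6, 0.4]

print(P_cavity, P_cavity_given_toothache)
```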
Probabilistic reasoning is exactly what allows us to do that.
Basically we compute the probabilities of certain events from the probabilities of other events,
usually ones we know from some form of conditional probability tables,
and use those to do things like compute arbitrary entries in the full joint probability distribution.
One good example for that is the Naive Bayes model.
It's the simplest possible model that allows for any kind of conditional and probabilistic reasoning in the first place.
We assume we have one cause and we have a bunch of effects, in this case only two.
We draw these little arrows to indicate that this thing causes that thing,
and by convention assume that if we draw a diagram such as this,
we mean to say that, given the cause, all of the effects are conditionally independent of each other,
and indeed any conjunctions of the effects are conditionally independent given the cause.
We call that a Naive Bayes model in the case where we have one cause and a bunch of effects.
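With that assumption, the full joint factorizes as P(Cause, E1, E2) = P(Cause) · P(E1 | Cause) · P(E2 | Cause), and the posterior over the cause given observed effects follows by normalizing. A minimal sketch with invented numbers (not from the lecture):

```python
import numpy as np

# Hypothetical Naive Bayes model: one Boolean cause, two Boolean effects.
# Index 0 = True, index 1 = False throughout.
P_cause = np.array([0.2, 0.8])            # prior P(Cause)
P_e1_given_cause = np.array([0.9, 0.1])   # P(E1=true | Cause=true/false)
P_e2_given_cause = np.array([0.7, 0.2])   # P(E2=true | Cause=true/false)

# Naive Bayes: given the cause, the effects are conditionally independent,
# so P(Cause, e1, e2) = P(Cause) * P(e1 | Cause) * P(e2 | Cause).
# Posterior after observing E1 = true and E2 = true, up to normalization:
unnormalized = P_cause * P_e1_given_cause * P_e2_given_cause
P_cause_given_effects = unnormalized / unnormalized.sum()

print(P_cause_given_effects)   # roughly [0.887, 0.113]
```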
Now a lot of the things we're going to do in this section basically follow the same cooking recipe throughout.
Whenever you're in some kind of situation where you have a bunch of random variables…